Distributed optimization of deeply nested systems
Authors
Abstract
• Many architectures share the fundamental design principle of constructing a deeply nested mapping from inputs to outputs.
• Learning these architectures is challenging because nesting (i.e., function composition) produces inherently nonconvex functions.
• Backprop suffers from vanishing gradients, is hard to parallelize, is only applicable if the mappings are differentiable with respect to the parameters, and needs careful tuning of learning rates.
• Selecting the best architecture, for example the number of units in each layer of a deep net or the number of filterbanks in a speech processing front-end, requires a combinatorial search.
• We describe a general optimization strategy called the method of auxiliary coordinates (MAC). It has provable convergence, is easy to implement by reusing existing algorithms for single layers, can be parallelized trivially and massively, applies even when parameter derivatives are not available or not desirable, can perform some model selection on the fly, and is competitive with state-of-the-art nonlinear optimizers even in the serial computation setting, often providing reasonable models within a few iterations.
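As a rough illustration of the idea summarized above, the following is a minimal sketch of a MAC-style alternation for a net with one sigmoid hidden layer and a linear output layer. It assumes the quadratic-penalty form of MAC, in which each data point gets an auxiliary coordinate vector standing in for its hidden-layer output. The names `mac_fit` and `penalty_objective`, the fixed penalty weight `mu`, and the single backtracking gradient step for the hidden layer are illustrative assumptions, not details taken from the abstract; in practice the penalty weight would typically be increased over iterations.

```python
import numpy as np


def sigmoid(a):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * a))


def penalty_objective(X, Y, W1, W2, Z, mu):
    # Output-layer fit on the auxiliary coordinates Z, plus a quadratic
    # penalty tying Z to the actual hidden-layer outputs.
    fit = 0.5 * np.sum((Y - Z @ W2.T) ** 2)
    link = 0.5 * mu * np.sum((Z - sigmoid(X @ W1.T)) ** 2)
    return fit + link


def mac_fit(X, Y, n_hidden=4, mu=1.0, n_iters=20, seed=0):
    # Hypothetical helper: alternates W steps and a Z step at fixed mu.
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.standard_normal((n_hidden, X.shape[1]))
    W2 = 0.1 * rng.standard_normal((Y.shape[1], n_hidden))
    Z = sigmoid(X @ W1.T)  # start the coordinates at the hidden outputs
    history = [penalty_objective(X, Y, W1, W2, Z, mu)]
    for _ in range(n_iters):
        # W step, output layer: ordinary linear least squares, Z W2^T ~ Y.
        W2 = np.linalg.lstsq(Z, Y, rcond=None)[0].T

        # W step, hidden layer: fit sigmoid(X W1^T) ~ Z, independently of
        # the output layer. One backtracking gradient step (an assumption;
        # any single-layer solver could be reused here).
        H = sigmoid(X @ W1.T)
        base = 0.5 * np.sum((Z - H) ** 2)
        grad = ((H - Z) * H * (1.0 - H)).T @ X
        step = 1.0
        while step > 1e-10:
            W1_try = W1 - step * grad
            if 0.5 * np.sum((Z - sigmoid(X @ W1_try.T)) ** 2) <= base:
                W1 = W1_try
                break
            step *= 0.5

        # Z step: closed form per data point because the output layer is
        # linear: (W2^T W2 + mu I) z_n = W2^T y_n + mu * sigmoid(W1 x_n).
        A = W2.T @ W2 + mu * np.eye(n_hidden)
        B = Y @ W2 + mu * sigmoid(X @ W1.T)
        Z = np.linalg.solve(A, B.T).T

        history.append(penalty_objective(X, Y, W1, W2, Z, mu))
    return W1, W2, Z, history
```

In this sketch the two layer updates do not depend on each other once Z is fixed, and the Z update separates over data points, which is where the parallelism mentioned in the abstract would come from.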
Similar resources
Parametric Polymorphism Optimization for Deeply Nested Types in Computer Algebra
Computer algebra systems, such as Axiom, and programming languages designed for computer algebra, such as Aldor, have very flexible mechanisms for generic code, with type parameterization. Modern versions of Maple can support this style of programming through the use of Maple's module system, and by using module-producing functions to give parametric type constructors. From the software design ...
Unsupervised Submodular Rank Aggregation on Score-based Permutations
Unsupervised rank aggregation on score-based permutations, which is widely used in many applications, has not been deeply explored yet. This work studies the use of submodular optimization for rank aggregation on score-based permutations in an unsupervised way. Specifically, we propose an unsupervised approach based on the Lovasz Bregman divergence for setting up linear structured convex and ne...
Optimization of majority protocol for controlling transactions concurrency in distributed databases by multi-agent systems
In this paper, we propose a new concurrency control algorithm based on multi-agent systems which is an extension of majority protocol. Then, we suggest a clustering approach to get better results in reliability, decreasing message passing and algorithm’s runtime. Here, we consider n different transactions working on non-conflict data items. Considering execution efficiency of some different...
Load Model Effect Assessment on Optimal Distributed Generation Sizing and Allocation Using Improved Harmony Search Algorithm
The operation of a distribution system in the presence of distributed generation systems has some advantages and challenges. Optimal sizing and siting of DG systems has economic, technical, and environmental benefits in distribution systems. Improper selection of DG systems can reduce these advantages or even result in deterioration in the normal operation of the distribution system. DG allocation ...
Development of PSPO Simulation Optimization Algorithm
In this article, a new algorithm is developed for optimizing computationally expensive simulation models. The optimization algorithm is developed for continuous, unconstrained, single-output simulation models. The algorithm is developed using two simulation optimization routines. We employed the nested partitioning (NP) routine for concentrating the search efforts in the regions which are most like...